WHAT IS CLAIMED IS:

1. A system for identifying relationships between database records, 
comprising: 

a memory operable to store a plurality of records comprising a first record and 
at least one second record, each record comprising at least one of a plurality of tokens; 
and 

one or more processors collectively operable to: 

determine a weight associated with each of the tokens; 
compare at least one second record to the first record; and 
determine at least one relationship indicator based on the comparison 
and at least one of the weights, the at least one relationship indicator identifying a 
level of relationship between the first record and at least one second record. 



2. The system of Claim 1, wherein the weight associated with one of the 
tokens is inversely proportional to a number of times that the token appears in the 
plurality of records. 



3. The system of Claim 1, wherein the one or more processors are 
collectively operable to determine the weight associated with one of the tokens using 
a formula of:

$$\text{Weight} = -\log\left(\frac{\text{Count}_{\text{Token}}}{\text{Total}_{\text{Tokens}}}\right),$$

where $\text{Count}_{\text{Token}}$ represents a number of times that the token appears in the
plurality of records, and $\text{Total}_{\text{Tokens}}$ represents a number of times that all
tokens appear in the plurality of records.
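As an illustration only, the weight formula of Claim 3 could be computed along the following lines. This is a minimal sketch: the function name `token_weights`, the list-of-token-lists record layout, and the choice of the natural logarithm are assumptions, since the claim does not fix a logarithm base or a data structure.

```python
import math
from collections import Counter


def token_weights(records):
    """Return {token: -log(count / total)} over all records.

    `records` is a list of token lists (a hypothetical layout).
    The natural logarithm is an assumption; the claim names no base.
    """
    counts = Counter(token for record in records for token in record)
    total = sum(counts.values())  # number of times all tokens appear
    return {token: -math.log(count / total) for token, count in counts.items()}


records = [["patent", "claim", "token"], ["patent", "record"], ["patent", "claim"]]
weights = token_weights(records)
```

Tokens that appear rarely across the records receive larger weights than frequent ones, which matches the inverse relationship recited in Claim 2.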

4. The system of Claim 1, wherein the one or more processors are
collectively operable to compare one of the second records to the first record by: 

identifying any common tokens, a common token comprising one of the 
tokens that appears in both the first record and the second record; and 

identifying a common count value for each common token, the common count 
value representing a minimum number of times that the common token appears in 
either the first record or the second record. 
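The comparison of Claim 4 can be sketched with a multiset intersection. The function name `common_counts` and the token-list inputs are illustrative assumptions, not part of the claim.

```python
from collections import Counter


def common_counts(first_record, second_record):
    """Map each common token to the minimum of its counts in the two records."""
    first, second = Counter(first_record), Counter(second_record)
    # Counter intersection keeps only shared tokens, each with the smaller count.
    return dict(first & second)


example = common_counts(["a", "a", "b", "c"], ["a", "b", "b", "d"])
```

Tokens found in only one of the two records drop out of the result, so only common tokens carry a common count value.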

5. The system of Claim 1, wherein the relationship indicator associated 
with one of the second records when compared to the first record is determined using 
a formula of:

$$\text{Relationship Indicator} = \frac{\sum_{i=1}^{j} \left(\text{Weight}_{\text{Token}_i} \times \text{Common Count}_{\text{Token}_i}\right)}{\text{Score}_{\text{First Record}}}$$

where j represents a number of unique common tokens that appear in both the first
record and the second record, $\text{Weight}_{\text{Token}_i}$ represents the weight associated with the
ith common token, $\text{Common Count}_{\text{Token}_i}$ represents a minimum number of times that
the ith common token appears in either the first record or the second record, and
$\text{Score}_{\text{First Record}}$ represents a record score associated with the first record.

6. The system of Claim 5, wherein the record score associated with the 
first record is determined using a formula of:

$$\text{Record Score} = \sum_{i=1}^{k} \left(\text{Weight}_{\text{Token}_i} \times \text{Count}_{\text{Token}_i}\right)$$

where k represents a number of unique tokens associated with the first record,
$\text{Weight}_{\text{Token}_i}$ represents the weight associated with the ith unique token, and
$\text{Count}_{\text{Token}_i}$ represents a number of times that the ith unique token appears in the first
record.
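The record score of Claim 6 and the relationship indicator of Claim 5 might be computed together as in the sketch below. The function names, the token-list inputs, and the precomputed `weights` dictionary are assumptions made for illustration. The sketch also assumes the first record has a non-zero score, since the claims do not say how a zero score would be handled.

```python
from collections import Counter


def record_score(record, weights):
    """Sum of weight * count over the unique tokens of one record."""
    return sum(weights[token] * n for token, n in Counter(record).items())


def relationship_indicator(first_record, second_record, weights):
    """Weighted common-token sum divided by the first record's score.

    Assumes record_score(first_record, weights) is non-zero.
    """
    common = Counter(first_record) & Counter(second_record)  # minimum counts
    shared = sum(weights[token] * n for token, n in common.items())
    return shared / record_score(first_record, weights)


weights = {"a": 1.0, "b": 2.0, "c": 3.0}  # hypothetical token weights
first = ["a", "a", "b"]
second = ["a", "b", "c"]
indicator = relationship_indicator(first, second, weights)
```

Under these definitions a record compared with itself yields an indicator of 1, and a record sharing no tokens with the first record yields 0.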

7. The system of Claim 1, wherein:

each of the plurality of records is associated with at least one document; 

the one or more processors are collectively operable to compare a plurality of 
second records to the first record and determine a plurality of relationship indicators; 
and 

the one or more processors are further collectively operable to: 

select one or more of the second records based on the relationship
indicators; and

make the documents associated with the one or more second records 
available to a user. 
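One possible way to select second records from their relationship indicators, as recited in Claim 7, is to rank them and apply a cutoff. The `threshold` and `limit` parameters and the record identifiers are hypothetical; the claim does not specify any particular selection rule.

```python
def select_records(indicators, threshold=0.5, limit=10):
    """Return record ids ranked by indicator, keeping those at or above threshold.

    `indicators` maps a record id to its relationship indicator.
    Both `threshold` and `limit` are illustrative parameters.
    """
    ranked = sorted(indicators.items(), key=lambda item: item[1], reverse=True)
    return [record_id for record_id, value in ranked if value >= threshold][:limit]


selected = select_records({"r1": 0.9, "r2": 0.2, "r3": 0.6})
```

The documents associated with the selected record ids would then be the ones made available to the user.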

8. The system of Claim 7, wherein the one or more processors are 
collectively operable to select the one or more second records based on input from the 
user. 

9. The system of Claim 1, wherein the one or more processors are 
collectively operable to allow a user to select the first record, wherein selecting the 
first record comprises at least one of selecting one of the plurality of records and 
submitting a document that the one or more processors may use to generate the first 
record. 

10. The system of Claim 1, wherein the one or more processors are further 
collectively operable to generate a plurality of text files, each text file associated with 
one of a plurality of documents and comprising the at least one token contained in the 
associated document. 

11. The system of Claim 10, wherein the one or more processors are 
collectively operable to generate the plurality of text files by performing at least one 
of optical character recognition and file conversion on each of the documents. 

12. The system of Claim 10, wherein the one or more processors are
further collectively operable to generate the plurality of records, each record 
associated with one of the text files and comprising the at least one token contained in 
the associated text file. 

13. The system of Claim 12, wherein the one or more processors are 
collectively operable to generate one of the records by: 

identifying one-word tokens in one of the text files, the one-word tokens 
comprising individual words in the text file; 

inserting the one-word tokens into the record;

selecting pairs of one-word tokens in the record, each pair of one-word tokens
comprising consecutive one-word tokens in the record;

combining the pairs of one-word tokens to produce two-word tokens; and 
inserting the two-word tokens into the record. 

14. The system of Claim 13, wherein the one or more processors are 
further collectively operable to ignore at least one stop word in the text file when 
identifying one-word tokens in one of the text files. 
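The record generation steps of Claims 13 and 14 can be sketched as follows. The stop-word list, the lowercase alphanumeric tokenization rule, and the function name `build_record` are assumptions chosen for the example; the claims do not define how words are delimited or which stop words are ignored.

```python
import re

STOP_WORDS = {"the", "a", "of", "and"}  # illustrative stop-word list


def build_record(text, stop_words=STOP_WORDS):
    """Return one-word tokens followed by two-word tokens for one text file."""
    # One-word tokens: individual words, with stop words ignored.
    words = [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in stop_words]
    # Two-word tokens: each pair of consecutive one-word tokens in the record.
    pairs = [f"{left} {right}" for left, right in zip(words, words[1:])]
    return words + pairs


record = build_record("The weight of a token")
```

Because stop words are dropped before pairing, two words separated only by a stop word become consecutive one-word tokens and therefore form a two-word token.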

15. The system of Claim 12, wherein the one or more processors are 
further collectively operable to: 

replace the tokens in the record with one or more token representations; and 
consolidate the record by ensuring that each unique token or token 
representation appears only once in the record. 

16. The system of Claim 1, wherein the one or more processors are further 
collectively operable to receive one or more documents using at least one of an 
interface coupled to a network, a drive operable to read at least one computer readable 
medium, and a scanner. 

17. The system of Claim 1, wherein the one or more processors are further
collectively operable to: 

receive a query from a user; 

identify one or more records that satisfy the query; 

identify one or more documents associated with the one or more records; and 
make the one or more documents available to the user. 

18. The system of Claim 1, wherein the one or more processors are further 
collectively operable to: 

generate a token table comprising a plurality of first entries, each first entry 
comprising one of the tokens, a token representation associated with the token, the 
weight associated with the token, and a first count value associated with the token, the 
first count value representing a number of times that the token appears in the plurality 
of records; 

generate a records table comprising a plurality of second entries, each second 
entry associated with one of the records and comprising one of the token 
representations and a second count value, the token representation in the second entry 
associated with one of the tokens contained in the record, the second count value 
representing a number of times that the token associated with the second entry 
appears in the record; and 

generate a records table index comprising a plurality of third entries, each 
third entry associated with one of the records and comprising an identification of at 
least one second entry associated with the record and a record score associated with 
the record. 
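The three structures of Claim 18 could be laid out as in the following sketch. Integer token representations, the flat list used for the records table, the `start`/`length` fields of the index, and the natural-log weight are all assumptions for illustration; the claim leaves the concrete encoding open.

```python
import math
from collections import Counter


def build_tables(records):
    """Build a token table, a records table, and a records table index."""
    totals = Counter(token for record in records for token in record)
    grand_total = sum(totals.values())

    # Token table: token -> representation, weight, and overall count.
    token_table = {}
    for rep, (token, count) in enumerate(sorted(totals.items())):
        token_table[token] = {
            "rep": rep,  # hypothetical integer token representation
            "weight": -math.log(count / grand_total),
            "count": count,
        }

    records_table = []  # flat list of (representation, count-in-record) entries
    records_index = []  # one entry per record: where its entries sit, plus score
    for record in records:
        counts = Counter(record)
        start = len(records_table)
        score = 0.0
        for token, n in sorted(counts.items()):
            entry = token_table[token]
            records_table.append((entry["rep"], n))
            score += entry["weight"] * n
        records_index.append({"start": start, "length": len(counts), "score": score})
    return token_table, records_table, records_index


token_table, records_table, records_index = build_tables([["a", "b", "a"], ["b", "c"]])
```

Storing the record score in the index means it is computed once when the tables are built and can be reused in later comparisons.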

19. The system of Claim 18, wherein the one or more processors are 
further collectively operable to convert at least one of the plurality of records, the 
token table, the records table, and the records table index from a first format to a 
second format, the second format used by an external system. 

20. The system of Claim 1, wherein the one or more processors are further
collectively operable to categorize each of the records based at least partially on the 
tokens contained in the records and locations of the tokens in the records. 

21. The system of Claim 1, wherein the one or more processors are further 
collectively operable to generate a correlithm object associated with at least one of the 
tokens, the correlithm object comprising a plurality of values defining a first point in a 
particular space, the particular space defined by a plurality of dimensions and 
including a plurality of points. 



22. The system of Claim 21, wherein:

a distance between the first point and each of the plurality of points in the 
particular space defines a distribution having a standard deviation; and 

a number of values in the correlithm object associated with one of the tokens
may be determined using a formula of:

$$\text{Number of Values} = \left\lceil \text{Weight}_{\text{Token}} \times \text{Standard Deviation} \right\rceil$$

where $\text{Weight}_{\text{Token}}$ represents the weight associated with the token, and Standard
Deviation represents the standard deviation of the distribution.

23. The system of Claim 21, wherein the one or more processors are
further collectively operable to generate a significance vector associated with the
correlithm object.

24. The system of Claim 23, wherein the significance vector comprises a
plurality of significance values, each significance value determined using a formula
of:

$$\text{Significance Value} = \frac{\text{Weight}_{\text{Token}} \times \text{Standard Deviation}}{\text{Number of Values}}$$

where $\text{Weight}_{\text{Token}}$ represents the weight associated with the token, Standard
Deviation represents the standard deviation of the distribution, and Number of Values
represents a number of values defining the first point in the correlithm object.
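A rough sketch of Claims 21 through 24 follows. It reads the bracket in the Number of Values formula as a ceiling and the Significance Value formula as a plain ratio, and it draws the values uniformly from [0, 1); all three points are assumptions, as are the function name and the positive `weight` and `standard_deviation` inputs.

```python
import math
import random


def make_correlithm_object(weight, standard_deviation, rng=None):
    """Return (values, significance_vector) for one token.

    Assumes weight * standard_deviation > 0. Uniform random values in
    [0, 1) are an illustrative choice for the point's coordinates.
    """
    rng = rng or random.Random()
    # Number of values, reading the claim's bracket as a ceiling.
    number_of_values = math.ceil(weight * standard_deviation)
    values = [rng.random() for _ in range(number_of_values)]
    # One significance value per coordinate, read here as a plain ratio.
    significance = weight * standard_deviation / number_of_values
    return values, [significance] * number_of_values


values, significance_vector = make_correlithm_object(2.5, 3.0, random.Random(0))
```

Under this reading the significance values of one token sum to the product of its weight and the standard deviation, so heavier tokens carry more total significance.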

25. The system of Claim 21, wherein:

the correlithm object comprises a first correlithm object, the first correlithm 
object associated with a first significance vector; 

the one or more processors are collectively operable to generate a first 
correlithm object and a first significance vector for each of the tokens; and 

the one or more processors are further collectively operable to generate a
second correlithm object and a second significance vector associated with the first 
record, the second correlithm object comprising at least one of the first correlithm 
objects. 

26. The system of Claim 25, wherein: 

the second correlithm object comprises one or more first entries and the 
second significance vector comprises one or more second entries, at least one first 
entry comprising one of the first correlithm objects; and 

a number of first entries in the second correlithm object and a number of 
second entries in the second significance vector are determined using a formula of: 

$$\text{Number of Entries} = \sum_{i=1}^{j} \left(\text{Maximum Instances}_{\text{Token}_i}\right)$$

where j represents a number of unique tokens contained in the plurality of records,
and $\text{Maximum Instances}_{\text{Token}_i}$ represents a maximum number of times that the ith
unique token appears in a single record in the plurality of records.
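The entry count of Claim 26 can be illustrated with a multiset union, which keeps the largest per-record count of each token. The function name and the token-list record layout are assumptions.

```python
from collections import Counter


def number_of_entries(records):
    """Sum, over unique tokens, of the most times each appears in one record."""
    max_instances = Counter()
    for record in records:
        # Counter union keeps the larger count seen so far for each token.
        max_instances |= Counter(record)
    return sum(max_instances.values())


entries = number_of_entries([["a", "a", "b"], ["a", "b", "b", "b"], ["c"]])
```

In the example, "a" contributes 2, "b" contributes 3, and "c" contributes 1.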

27. The system of Claim 26, wherein:

each first entry in the second correlithm object is associated with an instance 
of one of the tokens; 

each first entry in the second correlithm object is also associated with one 
second entry in the second significance vector; and 

the one or more processors are collectively operable to generate the second
significance vector by:

determining whether the instance of the token associated with one of 
the first entries appears in the first record; 

inserting one or more non-zero significance values into the second 
entry associated with the first entry when the instance of the token associated with the 
first entry appears in the first record; and 

inserting one or more zero significance values into the second entry 
associated with the first entry when the instance of the token associated with the first 
entry does not appear in the first record. 
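One way to read Claim 27 is that each entry corresponds to the kth instance of a token, and that instance "appears" in a record when the token occurs at least k times there. The sketch below follows that reading, which is an interpretation rather than something the claim states, and it stores a single significance value per entry for brevity. The names `instance_entries`, `instance_significance`, and `token_significance` are hypothetical.

```python
from collections import Counter


def instance_entries(records):
    """List (token, k) pairs, one per instance up to each token's maximum count."""
    max_instances = Counter()
    for record in records:
        max_instances |= Counter(record)
    return [(t, k) for t in sorted(max_instances) for k in range(1, max_instances[t] + 1)]


def instance_significance(record, entries, token_significance):
    """Non-zero significance where the kth instance is present, zero otherwise."""
    counts = Counter(record)
    return [token_significance[t] if counts[t] >= k else 0.0 for t, k in entries]


entries = instance_entries([["a", "a", "b"], ["b"]])
vector = instance_significance(["a", "b"], entries, {"a": 0.5, "b": 2.0})
```

Every record's vector has the same length and entry order, which is what allows entry-by-entry comparison between two records.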

28. The system of Claim 27, wherein:

the one or more processors are collectively operable to generate a second 
correlithm object and a second significance vector for each of the first record and at 
least one second record; and 

the relationship indicator associated with one of the second records when
compared to the first record is determined using a formula of:

$$\text{Relationship Indicator} = \frac{\sum_{i=1}^{N} \text{Overlap}_{AS_i,BS_i} \times \left[\text{Stnd. Dist.}_i^{2} - \sum_{j=1}^{M} \left(A_j - B_j\right)^{2}\right]}{\sum_{i=1}^{N} \left(\text{Overlap}_{AS_i,AS_i} \times \text{Stnd. Dist.}_i^{2}\right)}$$

where N represents the number of first entries in the second correlithm objects and the
number of second entries in the second significance vectors, $AS_i$ represents the
significance values in the ith second entry of the second significance vector associated
with the first record, $BS_i$ represents the significance values in the ith second entry of
the second significance vector associated with the second record, $\text{Overlap}_{AS_i,BS_i}$ and
$\text{Overlap}_{AS_i,AS_i}$ each represents an overlap value between the identified significance
values in the second significance vectors, $\text{Stnd. Dist.}_i$ represents a standard distance
associated with the first correlithm objects contained in the ith first entries of the
second correlithm objects, M represents the number of values in the first correlithm
objects contained in the ith first entries of the second correlithm objects, $A_j$ represents
the jth value of the first correlithm object contained in the ith first entry of the second
correlithm object associated with the first record, and $B_j$ represents the jth value of the
first correlithm object contained in the ith first entry of the second correlithm object
associated with the second record.

29. The system of Claim 28, wherein $\text{Overlap}_{AS_i,BS_i}$ and $\text{Overlap}_{AS_i,AS_i}$ each
comprises one of a minimum of the identified significance values in the second
significance vectors and a product of the identified significance values in the second
significance vectors.

30. The system of Claim 25, wherein:

the second correlithm object comprises one or more first entries and the 
second significance vector comprises one or more second entries, at least one first 
entry comprising one of the first correlithm objects; and 

a number of first entries in the second correlithm object and a number of 
second entries in the second significance vector equal a number of unique tokens in 
the plurality of records. 

31. The system of Claim 30, wherein:

each first entry in the second correlithm object is associated with one of the 
unique tokens; 

each first entry in the second correlithm object is also associated with one 
second entry in the second significance vector; and 

the one or more processors are collectively operable to generate the second 
significance vector by: 

determining a number of times that the unique token associated with 
the first entry appears in the first record; 

determining a maximum number of times that the unique token 
associated with the first entry appears in a single record in the plurality of records; 

inserting one or more significance values from the first significance 
vector associated with the unique token into the second entry associated with the first 
entry when the unique token associated with the first entry appears the maximum 
number of times in the first record; 

inserting one or more scaled significance values from the first 
significance vector associated with the unique token into the second entry associated 
with the first entry when the unique token associated with the first entry appears at 
least once but less than the maximum number of times in the first record; and 

inserting one or more zero significance values into the second entry 
associated with the first entry when the unique token associated with the first entry 
does not appear in the first record. 
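The three cases of Claim 31 can be collapsed into one expression if the scaling is taken to be linear in the token's count. That linear rule is an assumption: the claim says only that the values are "scaled". The function name, the single significance value per token, and the dictionary output are also illustrative.

```python
from collections import Counter


def scaled_significance(record, records, token_significance):
    """Return {token: significance scaled by count / maximum count}.

    Linear scaling is an assumed rule. It yields the full value at the
    maximum count, a reduced value below it, and zero when absent.
    """
    max_instances = Counter()
    for other in records:
        max_instances |= Counter(other)  # largest per-record count per token
    counts = Counter(record)
    return {
        token: token_significance[token] * counts[token] / max_instances[token]
        for token in sorted(max_instances)
    }


all_records = [["a", "a", "b"], ["a", "c"]]
vector = scaled_significance(["a", "c"], all_records, {"a": 1.0, "b": 2.0, "c": 3.0})
```

Compared with the per-instance layout, this variant needs only one entry per unique token, at the cost of encoding the count in the magnitude of the significance value.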

32. The system of Claim 1, wherein:
at least one token comprises a first correlithm object; and 
at least one of the records comprises a second correlithm object, the second 
correlithm object comprising at least one of the first correlithm objects. 

33. A method for identifying relationships between database records,
comprising: 

determining a weight associated with each of a plurality of tokens, each token 
contained in at least one of a plurality of records, the plurality of records comprising a 
first record and at least one second record; 

comparing at least one second record to the first record; and 
determining at least one relationship indicator based on the comparison and at 
least one of the weights, the at least one relationship indicator identifying a level of 
relationship between the first record and at least one second record. 

34. The method of Claim 33, wherein the weight associated with one of 
the tokens is determined using a formula of:

$$\text{Weight} = -\log\left(\frac{\text{Count}_{\text{Token}}}{\text{Total}_{\text{Tokens}}}\right),$$

where $\text{Count}_{\text{Token}}$ represents a number of times that the token appears in the
plurality of records, and $\text{Total}_{\text{Tokens}}$ represents a number of times that all
tokens appear in the plurality of records.

35. The method of Claim 33, wherein comparing one second record to the 
first record comprises: 

identifying any common tokens, a common token comprising one of the 
tokens that appears in both records; and 

identifying a common count value for each common token, the common count 
value representing a minimum number of times that the common token appears in 
either record. 

36. The method of Claim 33, wherein the relationship indicator associated
with one of the second records when compared to the first record is determined using 
a formula of:

$$\text{Relationship Indicator} = \frac{\sum_{i=1}^{j} \left(\text{Weight}_{\text{Token}_i} \times \text{Common Count}_{\text{Token}_i}\right)}{\text{Score}_{\text{First Record}}}$$

where j represents a number of unique common tokens that appear in both the first
record and the second record, $\text{Weight}_{\text{Token}_i}$ represents the weight associated with the
ith common token, $\text{Common Count}_{\text{Token}_i}$ represents a minimum number of times that
the ith common token appears in either the first record or the second record, and
$\text{Score}_{\text{First Record}}$ represents a record score associated with the first record.

37. The method of Claim 36, wherein the record score associated with the
first record is determined using a formula of:

$$\text{Record Score} = \sum_{i=1}^{k} \left(\text{Weight}_{\text{Token}_i} \times \text{Count}_{\text{Token}_i}\right)$$

where k represents a number of unique tokens associated with the first record,
$\text{Weight}_{\text{Token}_i}$ represents the weight associated with the ith unique token, and
$\text{Count}_{\text{Token}_i}$ represents a number of times that the ith unique token appears in the first
record.

38. The method of Claim 33, further comprising: 

generating a plurality of text files, each text file associated with one of a 
plurality of documents and comprising the at least one token contained in the 
associated document; and 

generating the plurality of records, each record associated with one of the text 
files and comprising the at least one token contained in the associated text file. 

39. The method of Claim 38, wherein generating one of the records
comprises: 

identifying one-word tokens in one of the text files, the one-word tokens 
comprising individual words in the text file; 

inserting the one-word tokens into the record;

selecting pairs of one-word tokens in the record, each pair of one-word tokens 
comprising consecutive one-word tokens in the record; 

combining the pairs of one-word tokens to produce two-word tokens; and 

inserting the two-word tokens into the record. 

40. The method of Claim 33, further comprising: 

generating a token table comprising a plurality of first entries, each first entry 
comprising one of the tokens, a token representation associated with the token, the 
weight associated with the token, and a first count value associated with the token, the 
first count value representing a number of times that the token appears in the plurality
of records; 

generating a records table comprising a plurality of second entries, each 
second entry associated with one of the records and comprising one of the token 
representations and a second count value, the token representation in the second entry
associated with one of the tokens contained in the record, the second count value
representing a number of times that the token associated with the second entry 
appears in the record; and 

generating a records table index comprising a plurality of third entries, each 
third entry associated with one of the records and comprising an identification of at
least one second entry associated with the record and a record score associated with
the record. 

41. The method of Claim 33, further comprising generating a correlithm 
object associated with at least one of the tokens, the correlithm object comprising a
plurality of values defining a first point in a particular space, the particular space
defined by a plurality of dimensions and including a plurality of points. 

42. The method of Claim 41, wherein:

a distance between the first point and each of the plurality of points in the 
particular space defines a distribution having a standard deviation; and 

a number of values in the correlithm object associated with one of the tokens
may be determined using a formula of:

$$\text{Number of Values} = \left\lceil \text{Weight}_{\text{Token}} \times \text{Standard Deviation} \right\rceil$$

where $\text{Weight}_{\text{Token}}$ represents the weight associated with the token, and Standard
Deviation represents the standard deviation of the distribution.

43. The method of Claim 41, further comprising generating a significance 
vector associated with the correlithm object. 

44. The method of Claim 43, wherein the significance vector comprises a 
plurality of significance values, each significance value determined using a formula 
of:

$$\text{Significance Value} = \frac{\text{Weight}_{\text{Token}} \times \text{Standard Deviation}}{\text{Number of Values}}$$

where $\text{Weight}_{\text{Token}}$ represents the weight associated with the token, Standard
Deviation represents the standard deviation of the distribution, and Number of Values
represents a number of values defining the first point in the correlithm object.

45. The method of Claim 41, wherein the correlithm object comprises a 
first correlithm object, the first correlithm object associated with a first significance 
vector; 

wherein a first correlithm object and a first significance vector are generated 
for each of the tokens; and 

further comprising generating a second correlithm object and a second 
significance vector associated with the first record, the second correlithm object 
comprising at least one of the first correlithm objects. 

46. The method of Claim 45, wherein:

the second correlithm object comprises one or more first entries and the 
second significance vector comprises one or more second entries, at least one first 
entry comprising one of the first correlithm objects; and 

a number of first entries in the second correlithm object and a number of 
second entries in the second significance vector are determined using a formula of: 

$$\text{Number of Entries} = \sum_{i=1}^{j} \left(\text{Maximum Instances}_{\text{Token}_i}\right)$$

where j represents a number of unique tokens contained in the plurality of records,
and $\text{Maximum Instances}_{\text{Token}_i}$ represents a maximum number of times that the ith
unique token appears in a single record in the plurality of records.

47. The method of Claim 46, wherein: 

each first entry in the second correlithm object is associated with an instance 
of one of the tokens; 

each first entry in the second correlithm object is also associated with one 
second entry in the second significance vector; and 

generating the second significance vector comprises: 

determining whether the instance of the token associated with one of 
the first entries appears in the first record; 

inserting one or more non-zero significance values into the second 
entry associated with the first entry when the instance of the token associated with the 
first entry appears in the first record; and 

inserting one or more zero significance values into the second entry 
associated with the first entry when the instance of the token associated with the first 
entry does not appear in the first record. 

48. The method of Claim 47, wherein:

a second correlithm object and a second significance vector are generated for 
each of the first record and at least one second record; 

the relationship indicator associated with one of the second records when
compared to the first record is determined using a formula of:

$$\text{Relationship Indicator} = \frac{\sum_{i=1}^{N} \text{Overlap}_{AS_i,BS_i} \times \left[\text{Stnd. Dist.}_i^{2} - \sum_{j=1}^{M} \left(A_j - B_j\right)^{2}\right]}{\sum_{i=1}^{N} \left(\text{Overlap}_{AS_i,AS_i} \times \text{Stnd. Dist.}_i^{2}\right)}$$

where N represents the number of first entries in the second correlithm objects and the
number of second entries in the second significance vectors, $AS_i$ represents the
significance values in the ith second entry of the second significance vector associated
with the first record, $BS_i$ represents the significance values in the ith second entry of
the second significance vector associated with the second record, $\text{Overlap}_{AS_i,BS_i}$ and
$\text{Overlap}_{AS_i,AS_i}$ each represents an overlap value between the identified significance
values in the second significance vectors, $\text{Stnd. Dist.}_i$ represents a standard distance
associated with the first correlithm objects contained in the ith first entries of the
second correlithm objects, M represents the number of values in the first correlithm
objects contained in the ith first entries of the second correlithm objects, $A_j$ represents
the jth value of the first correlithm object contained in the ith first entry of the second
correlithm object associated with the first record, and $B_j$ represents the jth value of the
first correlithm object contained in the ith first entry of the second correlithm object
associated with the second record; and

$\text{Overlap}_{AS_i,BS_i}$ and $\text{Overlap}_{AS_i,AS_i}$ each comprises one of a minimum of the
identified significance values in the second significance vectors and a product of the
identified significance values in the second significance vectors.

49. The method of Claim 45, wherein:

the second correlithm object comprises one or more first entries and the 
second significance vector comprises one or more second entries, at least one first 
entry comprising one of the first correlithm objects; and 

a number of first entries in the second correlithm object and a number of 
second entries in the second significance vector equal a number of unique tokens in 
the plurality of records. 

50. The method of Claim 49, wherein: 

each first entry in the second correlithm object is associated with one of the 
unique tokens; 

each first entry in the second correlithm object is also associated with one 
second entry in the second significance vector; and 

generating the second significance vector comprises: 

determining a number of times that the unique token associated with 
the first entry appears in the first record; 

determining a maximum number of times that the unique token 
associated with the first entry appears in a single record in the plurality of records; 

inserting one or more significance values from the first significance 
vector associated with the unique token into the second entry associated with the first 
entry when the unique token associated with the first entry appears the maximum 
number of times in the first record; 

inserting one or more scaled significance values from the first 
significance vector associated with the unique token into the second entry associated 
with the first entry when the unique token associated with the first entry appears at 
least once but less than the maximum number of times in the first record; and 

inserting one or more zero significance values into the second entry 
associated with the first entry when the unique token associated with the first entry 
does not appear in the first record. 

51. The method of Claim 33, wherein:
at least one token comprises a first correlithm object; and 
at least one of the records comprises a second correlithm object, the second 
correlithm object comprising at least one of the first correlithm objects. 

52. Software for identifying relationships between database records, the
software embodied on at least one computer readable medium and operable when 
executed to: 

determine a weight associated with each of a plurality of tokens, each token 
contained in at least one of a plurality of records, the plurality of records comprising a 
first record and at least one second record; 

compare at least one second record to the first record; and 
determine at least one relationship indicator based on the comparison and at 
least one of the weights, the at least one relationship indicator identifying a level of 
relationship between the first record and at least one second record. 

53. A system for identifying relationships between database records,
comprising: 

means for storing a plurality of records comprising a first record and at least 
one second record, each record comprising at least one of a plurality of tokens; 

means for determining a weight associated with each of the tokens; 

means for comparing at least one second record to the first record; and 

means for determining at least one relationship indicator based on the 
comparison and at least one of the weights, the at least one relationship indicator 
identifying a level of relationship between the first record and at least one second 
record. 

54. A system for identifying relationships between database records,
comprising: 

a memory operable to store a plurality of records, each record comprising at 
least one of a plurality of tokens; and 

one or more processors collectively operable to: 

determine a number of times that each token appears in the plurality of
records;

determine a number of times that all tokens appear in the plurality of
records;

determine a weight associated with each of the tokens, each weight 
based at least partially on the number of times that one of the tokens appears in the 
plurality of records and the number of times that all tokens appear in the plurality of 
records; 

generate a token table containing each of the tokens, a token 
representation associated with each token, and the weight associated with each token; 

generate a records table containing one or more token representations 
associated with the one or more tokens contained in each record, the records table also 
identifying a number of times that the one or more tokens appear in each record; and 

generate a records table index containing a location in the records table 
associated with each record and a record score associated with each record. 



55. The system of Claim 54, wherein the weight associated with one of the 
tokens is determined using a formula of:

$$\text{Weight} = -\log\left(\frac{\text{Count}_{\text{Token}}}{\text{Total}_{\text{Tokens}}}\right)$$

where $\text{Count}_{\text{Token}}$ represents the number of times that the token appears in the
plurality of records, and $\text{Total}_{\text{Tokens}}$ represents the number of times that all
tokens appear in the plurality of records.
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56. The system of Claim 54, wherein the record score associated with one 
of the records is determined using a formula of: 

k 

Record Score = £ (Weight Tokenk *Count Tokenk ) 

i=l 

where k represents a number of unique tokens associated with the record, Weight Token 
k represents the weight associated with the kth unique token, and Count To ken k 
represents a number of times that the kth unique token appears in the record. 
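
The weight and record score formulas of Claims 55 and 56 can be sketched in Python as follows. This is a minimal example, not the claimed implementation: the function names are hypothetical, and the natural logarithm is an assumption because the claims do not state a base.

```python
import math
from collections import Counter

def token_weights(records):
    """Weight = -log(Count_Token / Total_Tokens) for every token."""
    corpus = Counter(tok for rec in records for tok in rec)
    total = sum(corpus.values())  # number of times all tokens appear
    # ASSUMPTION: natural logarithm; the claims do not specify the base.
    return {tok: -math.log(n / total) for tok, n in corpus.items()}

def record_score(record, weights):
    """Sum of weight * in-record count over the record's unique tokens."""
    return sum(weights[tok] * n for tok, n in Counter(record).items())
```

Rare tokens receive larger weights than common ones, so a record made up of rare tokens has a higher score than a record of the same length made up of common tokens.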

57. The system of Claim 54, wherein: 

each record is associated with at least one document; and 
the one or more processors are collectively operable to: 

generate a plurality of text files, each text file associated with one of 
the documents and comprising the tokens contained in the associated document; and 

generate the plurality of records, each record associated with one of the 
text files and comprising the tokens contained in the associated text file. 

58. The system of Claim 57, wherein the one or more processors are 
collectively operable to generate one of the records by: 

identifying one-word tokens in one of the text files, the one-word tokens 
comprising individual words in the text file; 

inserting the one-word tokens in the record; 

selecting pairs of one-word tokens in the record, each pair of one-word tokens 
comprising consecutive one-word tokens in the record; 

combining the pairs of one-word tokens to produce two-word tokens; and 
inserting the two-word tokens in the record. 
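
The record generation steps of Claim 58 can be sketched as below. Splitting on whitespace and joining each pair with a single space are assumptions for illustration; the claim does not specify how words are delimited or how two-word tokens are encoded.

```python
def make_record(text):
    """Return one-word tokens followed by two-word tokens for a text file."""
    # ASSUMPTION: words are separated by whitespace.
    one_word = text.split()
    # Pair each one-word token with the token that immediately follows it.
    two_word = [f"{a} {b}" for a, b in zip(one_word, one_word[1:])]
    return one_word + two_word
```

For example, `make_record("red fox runs")` yields the three one-word tokens followed by `"red fox"` and `"fox runs"`.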

59. The system of Claim 54, wherein the token representations comprise 
correlithm objects, each correlithm object comprising a plurality of values defining a 
first point in a particular space, the particular space defined by a plurality of 
dimensions and including a plurality of points. 



ATTORNEY'S DOCKET 
066300.0132 



PATENT APPLICATION 




60. The system of Claim 59, wherein: 

a distance between the first point and each of the plurality of points in the 
particular space defines a distribution having a standard deviation; and 

a number of values in the correlithm object associated with one of the tokens 
may be determined using a formula of: 

$$\text{Number of Values} = \left\lceil \text{Weight}_{\text{Token}} \cdot \text{Standard Deviation} \right\rceil$$

where Weight_Token represents the weight associated with the token, and Standard 
Deviation represents the standard deviation of the distribution. 

61. The system of Claim 59, wherein each token representation further 
comprises a significance vector. 



62. The system of Claim 61, wherein each significance vector comprises a 
plurality of significance values, each significance value determined using a formula 
of: 



$$\text{Significance Value} = \frac{\text{Weight}_{\text{Token}} \cdot \text{Standard Deviation}}{\text{Number of Values}}$$

where Weight_Token represents the weight associated with the token, Standard 
Deviation represents the standard deviation of the distribution, and Number of Values 
represents a number of values defining the first point in the correlithm object. 
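
The sizing and significance formulas of Claims 60 and 62 can be illustrated with the short Python sketch below. It reads the brackets in the Number of Values formula as a ceiling, which is an interpretation of the printed formula; the function names and the handling of a zero product are assumptions.

```python
import math

def number_of_values(weight, std_dev):
    """Number of Values = ceil(Weight_Token * Standard Deviation)."""
    return math.ceil(weight * std_dev)

def significance_values(weight, std_dev):
    """One significance value per value in the token's correlithm object."""
    n = number_of_values(weight, std_dev)
    if n == 0:
        return []  # ASSUMPTION: a zero-weight token gets no values
    value = weight * std_dev / n
    return [value] * n
```

Under this reading, the significance values of a token sum to Weight_Token * Standard Deviation, so rounding up the number of values does not change the total significance carried by the token.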

63. The system of Claim 59, wherein: 

the correlithm objects comprise first correlithm objects, each first correlithm 
object associated with a first significance vector; and 

the one or more processors are further collectively operable to generate a 
second correlithm object and a second significance vector associated with each 
record, each second correlithm object comprising at least one of the first correlithm 
objects. 

64. The system of Claim 63, wherein: 

the second correlithm object comprises one or more first entries and the 
second significance vector comprises one or more second entries, at least one first 
entry comprising one of the first correlithm objects; and 

a number of first entries in the second correlithm object and a number of 
second entries in the second significance vector are determined using a formula of: 

$$\text{Number of Entries} = \sum_{i=1}^{j} \left(\text{Maximum Instances}_{\text{Token } i}\right)$$

where j represents a number of unique tokens contained in the plurality of records, 
and Maximum Instances_Token i represents a maximum number of times that the ith 
unique token appears in a single record in the plurality of records. 
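
The entry count of Claim 64 can be computed as in the following sketch, in which records are lists of tokens. The helper names are hypothetical.

```python
from collections import Counter

def maximum_instances(records):
    """Largest number of times each unique token appears in any one record."""
    max_counts = Counter()
    for rec in records:
        for tok, n in Counter(rec).items():
            max_counts[tok] = max(max_counts[tok], n)
    return dict(max_counts)

def number_of_entries(records):
    """Sum of Maximum Instances over all unique tokens."""
    return sum(maximum_instances(records).values())
```

For instance, with one record holding "a" twice and "b" once and another holding "a" once and "c" three times, the maxima are 2, 1 and 3, giving six entries.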

65. The system of Claim 64, wherein: 

each first entry in the second correlithm object is associated with an instance 
of one of the tokens; 

each first entry in the second correlithm object is also associated with one 
second entry in the second significance vector; and 

the one or more processors are collectively operable to generate the second 
significance vector by: 

determining whether the instance of the token associated with one of 
the first entries appears in the first record; 

inserting one or more non-zero significance values into the second 
entry associated with the first entry when the instance of the token associated with the 
first entry appears in the first record; and 

inserting one or more zero significance values into the second entry 
associated with the first entry when the instance of the token associated with the first 
entry does not appear in the first record. 

66. The system of Claim 63, wherein: 

the second correlithm object comprises a plurality of first entries and the 
second significance vector comprises a plurality of second entries, at least one first 
entry comprising one of the first correlithm objects; and 

a number of first entries in the second correlithm object and a number of 
second entries in the second significance vector equal a number of unique tokens in 
the plurality of records. 

67. The system of Claim 66, wherein: 

each first entry in the second correlithm object is associated with one of the 
unique tokens; 

each first entry in the second correlithm object is also associated with one 
second entry in the second significance vector; and 

the one or more processors are collectively operable to generate the second 
significance vector by: 

determining a number of times that the unique token associated with 
the first entry appears in the first record; 

determining a maximum number of times that the unique token 
associated with the first entry appears in a single record in the plurality of records; 

inserting one or more significance values from the first significance 
vector associated with the unique token into the second entry associated with the first 
entry when the unique token associated with the first entry appears the maximum 
number of times in the first record; 

inserting one or more scaled significance values from the first 
significance vector associated with the unique token into the second entry associated 
with the first entry when the unique token associated with the first entry appears at 
least once but less than the maximum number of times in the first record; and 

inserting one or more zero significance values into the second entry 
associated with the first entry when the unique token associated with the first entry 
does not appear in the first record. 
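
The three cases of Claim 67 can be sketched as follows, with one entry per unique token. The claim does not say how the significance values are scaled, so the factor used here (count in the record divided by the maximum count) is an assumption, as are the function and parameter names.

```python
from collections import Counter

def record_significance_vector(record, max_counts, token_significance):
    """Build the second significance vector for one record.

    max_counts: token -> maximum count of that token in any single record.
    token_significance: token -> list of first significance values.
    """
    counts = Counter(record)
    entries = []
    for tok in sorted(max_counts):  # one entry per unique token
        values = token_significance[tok]
        n = counts.get(tok, 0)
        if n == 0:
            entries.append([0.0] * len(values))  # token absent
        elif n >= max_counts[tok]:
            entries.append(list(values))  # token at its maximum count
        else:
            scale = n / max_counts[tok]  # ASSUMPTION: linear scaling
            entries.append([v * scale for v in values])
    return entries
```

The instance-based layout of Claim 65 differs only in that each entry corresponds to one instance of a token and is either filled with non-zero values or zeroed.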



68. The system of Claim 54, wherein: 
at least one token comprises a first correlithm object; and 
at least one of the records comprises a second correlithm object, the second 
correlithm object comprising at least one of the first correlithm objects. 

69. A method for identifying relationships between database records, 
comprising: 

determining a number of times that each of a plurality of tokens appears in a 
plurality of records, each record comprising at least one of the plurality of tokens; 

determining a number of times that all tokens appear in the plurality of 
records; 

determining a weight associated with each of the tokens, each weight based at 
least partially on the number of times that one of the tokens appears in the plurality of 
records and the number of times that all tokens appear in the plurality of records; 

generating a token table containing each of the tokens, a token representation 
associated with each token, and the weight associated with each token; 

generating a records table containing one or more token representations 
associated with the one or more tokens contained in each record, the records table also 
identifying a number of times that the one or more tokens appear in each record; and 

generating a records table index containing a location in the records table 
associated with each record and a record score associated with each record. 



70. The method of Claim 69, wherein the weight associated with one of 
the tokens is determined using a formula of: 



$$\text{Weight} = -\log\left(\frac{\text{Count}_{\text{Token}}}{\text{Total}_{\text{Tokens}}}\right)$$

where Count_Token represents the number of times that the token appears in the plurality 
of records, and Total_Tokens represents the number of times that all tokens appear in the 
plurality of records. 

71. The method of Claim 69, wherein the record score associated with one 
of the records is determined using a formula of: 

$$\text{Record Score} = \sum_{i=1}^{k} \left(\text{Weight}_{\text{Token } k} \cdot \text{Count}_{\text{Token } k}\right)$$

where k represents a number of unique tokens associated with the record, Weight_Token k 
represents the weight associated with the kth unique token, and Count_Token k 
represents a number of times that the kth unique token appears in the record. 

72. The method of Claim 69, wherein each record is associated with at 
least one document; and 

further comprising: 

generating a plurality of text files, each text file associated with one of 
the documents and comprising the tokens contained in the associated document; and 

generating the plurality of records, each record associated with one of 
the text files and comprising the tokens contained in the associated text file. 

73. The method of Claim 72, wherein generating one of the records 
comprises: 

identifying one-word tokens in one of the text files, the one-word tokens 
comprising individual words in the text file; 

inserting the one-word tokens in the record; 

selecting pairs of one-word tokens in the record, each pair of one-word tokens 
comprising consecutive one-word tokens in the record; 

combining the pairs of one-word tokens to produce two-word tokens; and 
inserting the two-word tokens in the record. 

74. The method of Claim 69, wherein the token representations comprise 
correlithm objects, each correlithm object comprising a plurality of values defining a 
first point in a particular space, the particular space defined by a plurality of 
dimensions and including a plurality of points. 

75. The method of Claim 74, wherein: 

a distance between the first point and each of the plurality of points in the 
particular space defines a distribution having a standard deviation; and 

a number of values in the correlithm object associated with one of the tokens 
may be determined using a formula of: 

$$\text{Number of Values} = \left\lceil \text{Weight}_{\text{Token}} \cdot \text{Standard Deviation} \right\rceil$$

where Weight_Token represents the weight associated with the token, and Standard 
Deviation represents the standard deviation of the distribution. 

76. The method of Claim 74, wherein each token representation further 
comprises a significance vector. 

77. The method of Claim 76, wherein each significance vector comprises a 
plurality of significance values, each significance value determined using a formula 
of: 



$$\text{Significance Value} = \frac{\text{Weight}_{\text{Token}} \cdot \text{Standard Deviation}}{\text{Number of Values}}$$

where Weight_Token represents the weight associated with the token, Standard 
Deviation represents the standard deviation of the distribution, and Number of Values 
represents a number of values defining the first point in the correlithm object. 

78. The method of Claim 74, wherein the correlithm objects comprise 
first correlithm objects, each first correlithm object associated with a first significance 
vector; and 

further comprising generating a second correlithm object and a second 
significance vector associated with each record, each second correlithm object 
comprising at least one of the first correlithm objects. 

79. The method of Claim 78, wherein: 

the second correlithm object comprises one or more first entries and the 
second significance vector comprises one or more second entries, at least one first 
entry comprising one of the first correlithm objects; and 

a number of first entries in the second correlithm object and a number of 
second entries in the second significance vector are determined using a formula of: 

$$\text{Number of Entries} = \sum_{i=1}^{j} \left(\text{Maximum Instances}_{\text{Token } i}\right)$$

where j represents a number of unique tokens contained in the plurality of records, 
and Maximum Instances_Token i represents a maximum number of times that the ith 
unique token appears in a single record in the plurality of records. 

80. The method of Claim 79, wherein: 

each first entry in the second correlithm object is associated with an instance 
of one of the tokens; 

each first entry in the second correlithm object is also associated with one 
second entry in the second significance vector; and 

generating the second significance vector comprises: 

determining whether the instance of the token associated with one of 
the first entries appears in the first record; 

inserting one or more non-zero significance values into the second 
entry associated with the first entry when the instance of the token associated with the 
first entry appears in the first record; and 

inserting one or more zero significance values into the second entry 
associated with the first entry when the instance of the token associated with the first 
entry does not appear in the first record. 

81. The method of Claim 78, wherein: 

the second correlithm object comprises a plurality of first entries and the 
second significance vector comprises a plurality of second entries, at least one first 
entry comprising one of the first correlithm objects; and 

a number of first entries in the second correlithm object and a number of 
second entries in the second significance vector equal a number of unique tokens in 
the plurality of records. 



82. The method of Claim 81, wherein: 

each first entry in the second correlithm object is associated with one of the 
unique tokens; 

each first entry in the second correlithm object is also associated with one 
second entry in the second significance vector; and 

generating the second significance vector comprises: 
determining a number of times that the unique token associated with 
the first entry appears in the first record; 

determining a maximum number of times that the unique token 
associated with the first entry appears in a single record in the plurality of records; 

inserting one or more significance values from the first significance 
vector associated with the unique token into the second entry associated with the first 
entry when the unique token associated with the first entry appears the maximum 
number of times in the first record; 

inserting one or more scaled significance values from the first 
significance vector associated with the unique token into the second entry associated 
with the first entry when the unique token associated with the first entry appears at 
least once but less than the maximum number of times in the first record; and 

inserting one or more zero significance values into the second entry 
associated with the first entry when the unique token associated with the first entry 
does not appear in the first record. 



83. The method of Claim 69, wherein: 
at least one token comprises a first correlithm object; and 
at least one of the records comprises a second correlithm object, the second 
correlithm object comprising at least one of the first correlithm objects. 

84. Software for identifying relationships between database records, the 
software embodied on at least one computer readable medium and operable when 
executed to: 

determine a number of times that each of a plurality of tokens appears in a 
plurality of records, each record comprising at least one of the plurality of tokens; 

determine a number of times that all tokens appear in the plurality of records; 

determine a weight associated with each of the tokens, each weight based at 
least partially on the number of times that one of the tokens appears in the plurality of 
records and the number of times that all tokens appear in the plurality of records; 

generate a token table containing each of the tokens, a token representation 
associated with each token, and the weight associated with each token; 

generate a records table containing one or more token representations 
associated with the one or more tokens contained in each record, the records table also 
identifying a number of times that the one or more tokens appear in each record; and 

generate a records table index containing a location in the records table 
associated with each record and a record score associated with each record. 

85. A system for identifying relationships between database records, 
comprising: 

means for storing a plurality of records, each record comprising at least one of 
a plurality of tokens; 

means for determining a number of times that each token appears in the 
plurality of records; 

means for determining a number of times that all tokens appear in the plurality 
of records; 

means for determining a weight associated with each of the tokens, each 
weight based at least partially on the number of times that one of the tokens appears in 
the plurality of records and the number of times that all tokens appear in the plurality 
of records; 

means for generating a token table containing each of the tokens, a token 
representation associated with each token, and the weight associated with each token; 

means for generating a records table containing one or more token 
representations associated with the one or more tokens contained in each record, the 
records table also identifying a number of times that the one or more tokens appear in 
each record; and 

means for generating a records table index containing a location in the records 
table associated with each record and a record score associated with each record. 

86. A system for identifying relationships between database records, 
comprising: 

a memory operable to store: 

a token table containing each of a plurality of tokens, a token 
representation associated with each token, and a weight associated with each token; 

a records table containing one or more token representations associated 
with one or more tokens contained in each of a first record and a second record, the 
records table also identifying a number of times that the one or more tokens appear in 
each record; and 

a records table index containing a location in the records table 
associated with each record and a record score associated with each record; and 
one or more processors collectively operable to: 

identify the one or more token representations associated with the first 
record and the one or more token representations associated with the second record 
using the records table index and the records table; and 

determine a relationship indicator associated with the second record 
using the identified token representations, the record score associated with the first 
record, and at least one of the weights associated with the token representations, the at 
least one relationship indicator identifying a level of relationship between the first 
record and at least one second record. 

87. The system of Claim 86, wherein the weight associated with one of the 
tokens is determined using a formula of: 



$$\text{Weight} = -\log\left(\frac{\text{Count}_{\text{Token}}}{\text{Total}_{\text{Tokens}}}\right)$$

where Count_Token represents the number of times that the token appears in the plurality 
of records, and Total_Tokens represents the number of times that all tokens appear in the 
plurality of records. 

88. The system of Claim 86, wherein the record score associated with the 
first record is determined using a formula of: 

$$\text{Record Score} = \sum_{i=1}^{k} \left(\text{Weight}_{\text{Token } k} \cdot \text{Count}_{\text{Token } k}\right)$$

where k represents a number of unique tokens associated with the first record, 
Weight_Token k represents the weight associated with the kth unique token, and 
Count_Token k represents a number of times that the kth unique token appears in the first 
record. 

89. The system of Claim 86, wherein the one or more processors are 
further collectively operable to: 

identify any common tokens, a common token comprising one of the tokens 
that appears in both the first record and the second record; and 

identify a common count value for each common token, the common count 
value representing a minimum number of times that the common token appears in 
either the first record or the second record. 
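
The common token and common count steps of Claim 89 can be sketched as follows, assuming each record is available as a list of tokens. The function name is hypothetical.

```python
from collections import Counter

def common_counts(first_record, second_record):
    """Map each common token to the smaller of its two in-record counts."""
    a, b = Counter(first_record), Counter(second_record)
    # A token is common only if it appears in both records.
    return {tok: min(a[tok], b[tok]) for tok in a.keys() & b.keys()}
```

Taking the minimum means that a token repeated many times in one record contributes only as often as it is matched in the other.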

90. The system of Claim 89, wherein the relationship indicator associated 
with the second record when compared to the first record is determined using a 
formula of: 

$$\text{Relationship Indicator} = \frac{\sum_{i=1}^{j} \left(\text{Weight}_{\text{Token } i} \cdot \text{Common Count}_{\text{Token } i}\right)}{\text{Score}_{\text{First Record}}}$$

where j represents a number of common tokens, Weight_Token i represents the weight 
associated with the ith common token, Common Count_Token i represents the common 
count value associated with the ith common token, and Score_First Record represents the 
record score associated with the first record. 

91. The system of Claim 86, wherein: 

a plurality of second records are each associated with at least one document; 
the one or more processors are collectively operable to determine a plurality of 
relationship indicators associated with the plurality of second records; and 
the one or more processors are further collectively operable to: 

select one or more of the second records based on the relationship 

indicators; and 

make the documents associated with the one or more second records 
available to a user. 
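
The selection step of Claim 91 could take many forms. The sketch below ranks second records by relationship indicator and returns the associated documents; the `top_n` and `threshold` parameters, the dictionary inputs, and the function name are assumptions for illustration, since the claim only requires that selection be based on the relationship indicators.

```python
def select_documents(indicators, documents, top_n=3, threshold=0.0):
    """Return documents for the highest-scoring second records.

    indicators: record id -> relationship indicator.
    documents: record id -> associated document (for example, a file name).
    """
    ranked = sorted(indicators.items(), key=lambda kv: kv[1], reverse=True)
    # ASSUMPTION: keep records at or above the threshold, best first.
    chosen = [rid for rid, score in ranked if score >= threshold][:top_n]
    return [documents[rid] for rid in chosen]
```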

92. The system of Claim 86, wherein the one or more processors are 
collectively operable to allow a user to select the first record, wherein selecting the 
first record comprises at least one of selecting one of the plurality of records and 
submitting a document that the one or more processors may use to generate the first 
record. 

93. The system of Claim 86, wherein the one or more processors are 
further collectively operable to: 

receive a query from a user; 

identify one or more records that satisfy the query; 

identify one or more documents associated with the one or more records; and 
make the one or more documents available to the user. 

94. The system of Claim 86, wherein: 

a first correlithm object and a first significance vector are associated with each 
of the tokens, each first correlithm object comprising a plurality of values defining a 
first point in a particular space, the particular space defined by a plurality of 
dimensions and including a plurality of points; and 

a second correlithm object and a second significance vector are associated 
with each of the first record and the second record, each second correlithm object 
comprising at least one of the first correlithm objects. 

95. The system of Claim 94, wherein: 

a distance between one of the first points and each of the plurality of points in 
one of the particular spaces defines a distribution having a standard deviation; and 

a number of values in one of the first correlithm objects associated with one 
of the tokens may be determined using a formula of: 

$$\text{Number of Values} = \left\lceil \text{Weight}_{\text{Token}} \cdot \text{Standard Deviation} \right\rceil$$

where Weight_Token represents the weight associated with the token, and Standard 
Deviation represents the standard deviation of the distribution. 

96. The system of Claim 95, wherein one of the first significance vectors 
comprises a plurality of significance values, each significance value determined using 
a formula of: 

$$\text{Significance Value} = \frac{\text{Weight}_{\text{Token}} \cdot \text{Standard Deviation}}{\text{Number of Values}}$$

97. The system of Claim 94, wherein: 

the second correlithm object comprises one or more first entries and the 
second significance vector comprises one or more second entries, at least one first 
entry comprising one of the first correlithm objects; and 

each first entry in the second correlithm objects is associated with an instance 
of one of the tokens. 

98. The system of Claim 94, wherein: 

the second correlithm object comprises one or more first entries and the 
second significance vector comprises one or more second entries, at least one first 
entry comprising one of the first correlithm objects; and 

each first entry in the second correlithm objects is associated with one of the 
unique tokens. 

99. The system of Claim 94, wherein: 

the second correlithm object comprises one or more first entries and the 
second significance vector comprises one or more second entries, at least one first 
entry comprising one of the first correlithm objects; 

each first entry in the second correlithm object is also associated with one 
second entry in the second significance vector; 

the relationship indicator associated with the second record when compared to 
the first record is determined using a formula of: 

Relationship Indicator = [formula not reproduced in the source text] 

where N represents the number of first entries in the second correlithm objects and the 
number of second entries in the second significance vectors, ASi represents 
significance values in the ith second entry of the second significance vector associated 
with the first record, BSi represents significance values in the ith second entry of the 
second significance vector associated with the second record, Overlap ASi,BSi and 
Overlap ASi,ASi each represents an overlap value between the identified significance 
values in the second significance vectors, Stnd. Dist.i represents a standard distance 
associated with the first correlithm objects contained in the ith first entries of the 
second correlithm objects, M represents the number of values in the first correlithm 
objects contained in the ith first entries of the second correlithm objects, Aj represents 
the jth value of the first correlithm object contained in the ith first entry of the second 
correlithm object associated with the first record, and Bj represents the jth value of the 
first correlithm object contained in the ith first entry of the second correlithm object 
associated with the second record; and 

wherein Overlap ASi,BSi and Overlap ASi,ASi each comprises one of a minimum of 
the identified significance values in the second significance vectors and a product of 
the identified significance values in the second significance vectors. 

100. The system of Claim 86, wherein: 
at least one token comprises a first correlithm object; and 
at least one of the records comprises a second correlithm object, the second 
correlithm object comprising at least one of the first correlithm objects. 



101. A method for identifying relationships between database records, 
comprising: 

identifying one or more token representations associated with a first record 
and one or more token representations associated with a second record using a records 
table index and a records table, the records table containing the one or more token 
representations associated with one or more tokens contained in each of the records, 
the records table also identifying a number of times that the one or more tokens 
appear in each record, the records table index containing a location in the records 
table associated with each record and a record score associated with each record; and 
determining a relationship indicator associated with the second record using 
the identified token representations, the record score associated with the first record, 
and at least one of a plurality of weights associated with the token representations, the 
at least one relationship indicator identifying a level of relationship between the first 
record and at least one second record. 






102. The method of Claim 101, wherein the weight associated with one of 
the tokens is determined using a formula of: 

$$\text{Weight} = -\log\left(\frac{\text{Count}_{\text{Token}}}{\text{Total}_{\text{Tokens}}}\right)$$

where Count_Token represents the number of times that the token appears in the plurality 
of records, and Total_Tokens represents the number of times that all tokens appear in the 
plurality of records. 



103. The method of Claim 101, wherein the record score associated with the 
first record is determined using a formula of: 

$$\text{Record Score} = \sum_{i=1}^{k} \left(\text{Weight}_{\text{Token } k} \cdot \text{Count}_{\text{Token } k}\right)$$

where k represents a number of unique tokens associated with the first record, 
Weight_Token k represents the weight associated with the kth unique token, and 
Count_Token k represents a number of times that the kth unique token appears in the first 
record. 






104. The method of Claim 101, further comprising: 

identifying any common tokens, a common token comprising one of the 
tokens that appears in both the first record and the second record; and 

identifying a common count value for each common token, the common count 
value representing a minimum number of times that the common token appears in 
either the first record or the second record. 

105. The method of Claim 104, wherein the relationship indicator 
associated with the second record when compared to the first record is determined 
using a formula of: 

$$\text{Relationship Indicator} = \frac{\sum_{i=1}^{j} \left(\text{Weight}_{\text{Token } i} \cdot \text{Common Count}_{\text{Token } i}\right)}{\text{Score}_{\text{First Record}}}$$

where j represents a number of common tokens, Weight_Token i represents the weight 
associated with the ith common token, Common Count_Token i represents the common 
count value associated with the ith common token, and Score_First Record represents the 
record score associated with the first record. 
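
The relationship indicator of Claim 105 can be sketched as below, with records given as token lists and the weights supplied as a dictionary. The function name and the choice to return 0.0 for a first record with a zero score are assumptions made for this example.

```python
from collections import Counter

def relationship_indicator(first_record, second_record, weights):
    """Weighted common-token count divided by the first record's score."""
    a, b = Counter(first_record), Counter(second_record)
    first_score = sum(weights[tok] * n for tok, n in a.items())
    if first_score == 0:
        return 0.0  # ASSUMPTION: avoid dividing by a zero record score
    # Common count is the smaller of the two in-record counts.
    shared = sum(weights[tok] * min(n, b[tok])
                 for tok, n in a.items() if tok in b)
    return shared / first_score
```

Because the denominator is the first record's own score, comparing a record with itself gives 1, and the indicator is not symmetric: swapping the first and second records generally changes the result.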

106. The method of Claim 101, wherein: 

a plurality of second records are each associated with at least one document; 
a plurality of relationship indicators associated with the plurality of second 
records are determined; and 
further comprising: 

selecting one or more of the second records based on the relationship 

indicators; and 

making the documents associated with the one or more second records 
available to a user. 

107. The method of Claim 101, wherein: 

a first correlithm object and a first significance vector are associated with each 
of the tokens, each first correlithm object comprising a plurality of values defining a 
first point in a particular space, the particular space defined by a plurality of 
dimensions and including a plurality of points; and 

a second correlithm object and a second significance vector are associated 
with each of the first record and the second record, each second correlithm object 
comprising at least one of the first correlithm objects. 



108. The method of Claim 107, wherein: 

a distance between one of the first points and each of the plurality of points in 
one of the particular spaces defines a distribution having a standard deviation; 

a number of values in one of the first correlithm objects associated with one 
of the tokens may be determined using a formula of: 

\[
\text{Number of Values} = \left\lceil \text{Weight}_{\text{Token}} \times \text{Standard Deviation} \right\rceil
\]

where $\text{Weight}_{\text{Token}}$ represents the weight associated with the token, and Standard 
Deviation represents the standard deviation of the distribution; and 

one of the first significance vectors comprises a plurality of significance 
values, each significance value determined using a formula of: 

\[
\text{Significance Value} = \frac{\text{Weight}_{\text{Token}} \times \text{Standard Deviation}}{\text{Number of Values}}
\]
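
The two formulas of Claim 108 can be illustrated with a short Python sketch. It assumes that the brackets around the product in the Number of Values formula denote rounding up to the next integer, which keeps each significance value at or below 1. The function names and the sample weight and standard deviation are hypothetical.

```python
import math


def number_of_values(weight, std_dev):
    # Assumption: the bracketed product is rounded up to an integer.
    return math.ceil(weight * std_dev)


def significance_vector(weight, std_dev):
    n = number_of_values(weight, std_dev)
    # Every significance value in the vector is the same quotient.
    value = (weight * std_dev) / n
    return [value] * n


vec = significance_vector(weight=2.3, std_dev=1.5)
print(len(vec), vec[0])
```

With these inputs the product is 3.45, so the vector has four entries that together sum to the product. A weight of zero would give zero values and is not handled by the sketch.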



109. The method of Claim 107, wherein: 

the second correlithm object comprises one or more first entries and the 
second significance vector comprises one or more second entries, at least one first 
entry comprising one of the first correlithm objects; and 

each first entry in the second correlithm objects is associated with an instance 
of one of the tokens. 

110. The method of Claim 107, wherein: 

the second correlithm object comprises one or more first entries and the 
second significance vector comprises one or more second entries, at least one first 
entry comprising one of the first correlithm objects; and 

each first entry in the second correlithm objects is associated with one of the 
unique tokens. 

111. The method of Claim 107, wherein: 

the second correlithm object comprises one or more first entries and the 
second significance vector comprises one or more second entries, at least one first 
entry comprising one of the first correlithm objects; 

each first entry in the second correlithm object is also associated with one 
second entry in the second significance vector; 

the relationship indicator associated with the second record when compared to 
the first record is determined using a formula of: 

\[
\text{Relationship Indicator} = \frac{\sum_{i=1}^{N} \left( \text{Overlap}_{AS_i,BS_i} \times \left( \text{Stnd. Dist.}_i^{2} - \sum_{j=1}^{M} (A_j - B_j)^{2} \right) \right)}{\sum_{i=1}^{N} \left( \text{Overlap}_{AS_i,AS_i} \times \text{Stnd. Dist.}_i^{2} \right)}
\]

where N represents the number of first entries in the second correlithm objects and the 
number of second entries in the second significance vectors, $AS_i$ represents 
significance values in the ith second entry of the second significance vector associated 
with the first record, $BS_i$ represents significance values in the ith second entry of the 
second significance vector associated with the second record, $\text{Overlap}_{AS_i,BS_i}$ and 
$\text{Overlap}_{AS_i,AS_i}$ each represents an overlap value between the identified significance 
values in the second significance vectors, $\text{Stnd. Dist.}_i$ represents a standard distance 
associated with the first correlithm objects contained in the ith first entries of the 
second correlithm objects, M represents the number of values in the first correlithm 
objects contained in the ith first entries of the second correlithm objects, $A_j$ represents 
the jth value of the first correlithm object contained in the ith first entry of the second 
correlithm object associated with the first record, and $B_j$ represents the jth value of the 
first correlithm object contained in the ith first entry of the second correlithm object 
associated with the second record; and 

wherein $\text{Overlap}_{AS_i,BS_i}$ and $\text{Overlap}_{AS_i,AS_i}$ each comprises one of a minimum of 
the identified significance values in the second significance vectors and a product of 
the identified significance values in the second significance vectors. 
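
The computation described in Claim 111 can be sketched in Python under several assumptions. The printed formula is difficult to read, so the sketch assumes the indicator is a ratio: the numerator sums, over the N entries, the overlap of the two records' significance values times the squared standard distance less the squared distance between the entries' values, and the denominator sums the first record's self-overlap times the squared standard distance. It also assumes one significance value per entry. All function and parameter names are hypothetical.

```python
def overlap(x, y, mode="min"):
    # The claim allows either the minimum or the product of the two values.
    return min(x, y) if mode == "min" else x * y


def relationship_indicator(a_entries, b_entries, a_sig, b_sig, std_dist, mode="min"):
    """a_entries, b_entries: N entries, each a list of M values.
    a_sig, b_sig: N significance values (assumed one per entry).
    std_dist: N standard distances, one per entry.
    """
    num = 0.0
    den = 0.0
    for a, b, s_a, s_b, d in zip(a_entries, b_entries, a_sig, b_sig, std_dist):
        # Squared distance between the two first correlithm objects.
        sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
        num += overlap(s_a, s_b, mode) * (d ** 2 - sq_dist)
        den += overlap(s_a, s_a, mode) * d ** 2
    return num / den


a = [[0.1, 0.9], [0.4, 0.6]]
b = [[0.2, 0.7], [0.4, 0.6]]
sig = [1.0, 0.5]
dist = [0.5, 0.5]
print(relationship_indicator(a, a, sig, sig, dist))  # identical records
print(relationship_indicator(a, b, sig, sig, dist))  # first entries differ
```

Under this reading, a record compared with itself yields exactly 1, and the indicator falls as the entries move apart.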

112. The method of Claim 101, wherein: 
at least one token comprises a first correlithm object; and 
at least one of the records comprises a second correlithm object, the second 
correlithm object comprising at least one of the first correlithm objects. 

113. Software for identifying relationships between database records, the 
software embodied on at least one computer readable medium and operable when 
executed to: 

identify one or more token representations associated with a first record and 
one or more token representations associated with a second record using a records 
table index and a records table, the records table containing the one or more token 
representations associated with one or more tokens contained in each of the records, 
the records table also identifying a number of times that the one or more tokens 
appear in each record, the records table index containing a location in the records 
table associated with each record and a record score associated with each record; and 

determine a relationship indicator associated with the second record using the 
identified token representations, the record score associated with the first record, and 
at least one of a plurality of weights associated with the token representations, the at 
least one relationship indicator identifying a level of relationship between the first 
record and at least one second record. 

114. A system for identifying relationships between database records, 
comprising: 

means for identifying one or more token representations associated with a first 
record and one or more token representations associated with a second record using a 
records table index and a records table, the records table containing the one or more 
token representations associated with one or more tokens contained in each of the 
records, the records table also identifying a number of times that the one or more 
tokens appear in each record, the records table index containing a location in the 
records table associated with each record and a record score associated with each 
record; and 

means for determining a relationship indicator associated with the second 
record using the identified token representations, the record score associated with the 
first record, and at least one of a plurality of weights associated with the token 
representations, the at least one relationship indicator identifying a level of 
relationship between the first record and at least one second record. 

115. A method for identifying relationships between database records, 
comprising: 

communicating at least one of one or more documents, one or more text files, 
and one or more records to a server, each of the at least one of the documents, the text 
files, and the records comprising at least one of a plurality of tokens; and 
wherein the server is operable to: 

determine a weight associated with each of the tokens; 

compare two of the at least one of the documents, the text files and the 
records; and 

determine a relationship indicator based on the comparison and at least 
one of the weights, the at least one relationship indicator identifying a level of 
relationship between the two of the at least one of the documents, the text files and the 
records. 

116. A method for identifying relationships between database records, 
comprising: 

communicating at least one of one or more documents, one or more text files, 
and one or more records to an indexing engine, each of the at least one of the 
documents, the text files, and the records comprising at least one of a plurality of 
tokens; and 

wherein the indexing engine is operable to: 

determine a number of times that each token appears in the at least one 
of the documents, the text files, and the records; 

determine a number of times that all tokens appear in the at least one of 
the documents, the text files, and the records; 

determine a weight associated with each of the tokens, each weight 
based at least partially on the number of times that one of the tokens appears in the at 
least one of the documents, the text files, and the records and the number of times that 
all tokens appear in the at least one of the documents, the text files, and the records; 

generate a token table containing each of the tokens, a token 
representation associated with each token, and the weight associated with each token; 

generate a records table containing one or more token representations 
associated with the one or more tokens contained in each of the at least one of the 
documents, the text files, and the records, the records table also identifying a number 
of times that the one or more tokens appear in each of the at least one of the 
documents, the text files, and the records; and 

generate a records table index containing a location in the records table 
associated with each of the at least one of the documents, the text files, and the 
records and a score associated with each of the at least one of the documents, the text 
files, and the records. 
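
The three structures generated by the indexing engine in Claim 116 can be sketched in Python as follows. The sketch assumes the weight formula recited in Claim 3, uses small integers as token representations, and stores each record's run length in the index alongside its location. It also assumes the score is the sum of weight times count over the record's tokens. None of these choices is stated in the claim, and all names are hypothetical.

```python
import math
from collections import Counter


def build_index(docs):
    """docs: record id -> list of tokens."""
    counts = Counter(tok for toks in docs.values() for tok in toks)
    total = sum(counts.values())

    # Token table: token -> (token representation, weight).
    token_table = {
        tok: (rep, -math.log(n / total))
        for rep, (tok, n) in enumerate(sorted(counts.items()))
    }

    records_table = []   # flat list of (token representation, count in record)
    records_index = {}   # record id -> (location, length, score)
    for rec_id, toks in docs.items():
        start = len(records_table)
        score = 0.0
        for tok, n in sorted(Counter(toks).items()):
            rep, weight = token_table[tok]
            records_table.append((rep, n))
            score += weight * n  # assumed definition of the score
        records_index[rec_id] = (start, len(records_table) - start, score)
    return token_table, records_table, records_index


docs = {"a": ["x", "y", "x"], "b": ["y", "z"]}
token_table, records_table, records_index = build_index(docs)
print(records_table)
print(records_index["b"][:2])
```

Keeping the records table flat and addressing it through the index means a record's tokens can be read with a single slice, which is one plausible reason for separating the two structures.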

117. A method for identifying relationships between database records, 
comprising: 

selecting a first record in a plurality of records, each record comprising at least 
one of a plurality of tokens, wherein selecting the first record comprises at least one of 
selecting one of the plurality of records and submitting a document to a server that the 
server may use to generate the first record; and 

wherein the server is operable to: 

compare the first record to at least one other of the plurality of records; 
and 

determine at least one relationship indicator based on the comparison 
and at least one of a plurality of weights associated with the tokens, the at least one 
relationship indicator identifying a level of relationship between the first record and at 
least one other record. 

118. A method for identifying relationships between database records, 
comprising: 

storing a first correlithm object and second correlithm object, the first and 
second correlithm objects each comprising a plurality of first entries, each first entry 
comprising one or more values; 

storing a first significance vector and a second significance vector, the first 
and second significance vectors each comprising a plurality of second entries, each 
second entry comprising one or more significance values; and 

determining a relationship indicator associated with the first and second 
correlithm objects, the relationship indicator determined using a formula of: 

\[
\text{Relationship Indicator} = \frac{\sum_{i=1}^{N} \left( \text{Overlap}_{AS_i,BS_i} \times \left( \text{Stnd. Dist.}_i^{2} - \sum_{j=1}^{M} (A_j - B_j)^{2} \right) \right)}{\sum_{i=1}^{N} \left( \text{Overlap}_{AS_i,AS_i} \times \text{Stnd. Dist.}_i^{2} \right)}
\]

where N represents a number of first entries in the first and second correlithm objects, 
$AS_i$ represents the significance values in the ith second entry of the first significance 
vector, $BS_i$ represents the significance values in the ith second entry of the second 
significance vector, $\text{Overlap}_{AS_i,BS_i}$ and $\text{Overlap}_{AS_i,AS_i}$ each represents one of a 
minimum and a product of the identified significance values, $\text{Stnd. Dist.}_i$ represents a 
standard distance associated with the ith first entries of the first and second correlithm 
objects, M represents a number of values in the ith first entry of the first and second 
correlithm objects, $A_j$ represents the jth value in the ith first entry of the first 
correlithm object, and $B_j$ represents the jth value in the ith first entry of the second 
correlithm object. 

119. A method for identifying relationships between database records, 
comprising: 

storing a first correlithm object and a first significance vector, the first 
correlithm object comprising a plurality of first values, the first significance vector 
comprising a plurality of first significance values; 

storing a second correlithm object and a second significance vector, the second 
correlithm object comprising a plurality of second values, the second significance 
vector comprising a plurality of second significance values; 

determining a relationship indicator associated with the first and second 
correlithm objects, the relationship indicator determined using a formula of: 

\[
\text{Relationship Indicator} = \frac{\sum_{i=1}^{N} \left( \text{Overlap}_{AS_i,BS_i} \times \left( \text{[illegible]} - (A_i - B_i)^{2} \right) \right)}{\sum_{i=1}^{N} \left( \text{Overlap}_{AS_i,AS_i} \times \text{[illegible]} \right)}
\]

where N represents a number of first and second values in the first and second 
correlithm objects, $AS_i$ represents the ith first significance value in the first 
significance vector, $BS_i$ represents the ith second significance value in the second 
significance vector, $\text{Overlap}_{AS_i,BS_i}$ and $\text{Overlap}_{AS_i,AS_i}$ each represents one of a 
minimum and a product of the identified significance values, $A_i$ represents the ith first 
value of the first correlithm object, and $B_i$ represents the ith second value of the 
second correlithm object. 